As Artificial and Robotic Systems are increasingly deployed and relied upon for real-world applications, it is important that they exhibit the ability to continually learn and adapt in dynamically-changing environments, becoming Lifelong Learning Machines. Continual/lifelong learning (LL) involves minimizing catastrophic forgetting of old tasks while maximizing a model's capability to learn new tasks. This paper addresses the challenging lifelong reinforcement learning (L2RL) setting. Pushing the state-of-the-art forward in L2RL and making L2RL useful for practical applications requires more than developing individual L2RL algorithms; it requires making progress at the systems-level, especially research into the non-trivial problem of how to integrate multiple L2RL algorithms into a common framework. In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing different aspects of the lifelong learning problem) into a unified system. As an instantiation of L2RLCF, we develop a standard API allowing easy integration of novel lifelong learning components. We describe a case study that demonstrates how multiple independently-developed LL components can be integrated into a single realized system. We also introduce an evaluation environment in order to measure the effect of combining various system components. Our evaluation environment employs different LL scenarios (sequences of tasks) consisting of Starcraft-2 minigames and allows for the fair, comprehensive, and quantitative comparison of different combinations of components within a challenging common evaluation environment.
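The abstract describes a standard API into which independently-developed lifelong-learning components plug. A minimal sketch of what such a pluggable component interface could look like is below; the class and method names (`LLComponent`, `on_task_start`, `process_batch`) are our own illustrative assumptions, not the paper's actual L2RLCF API:

```python
from abc import ABC, abstractmethod

class LLComponent(ABC):
    """Hypothetical pluggable lifelong-RL component: hooks into task
    boundaries and into the experience stream."""
    @abstractmethod
    def on_task_start(self, task_id): ...

    @abstractmethod
    def process_batch(self, batch):
        """Transform or augment a batch of experiences; return the batch."""
        ...

    @abstractmethod
    def on_task_end(self, task_id): ...

class Agent:
    """Runs a chain of components over each experience batch."""
    def __init__(self, components):
        self.components = components

    def learn(self, task_id, batches):
        for c in self.components:
            c.on_task_start(task_id)
        for b in batches:
            for c in self.components:
                b = c.process_batch(b)  # each component sees/augments the batch
            # ...policy update on b would happen here...
        for c in self.components:
            c.on_task_end(task_id)
```

Components composed this way can be swapped or combined without changing the agent's training loop, which is the kind of systems-level integration the framework targets.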
To enable learning on edge devices with fast convergence and low memory, we propose a novel backpropagation-free optimization algorithm dubbed Target Projection Stochastic Gradient Descent (tpSGD). tpSGD generalizes direct random target projection to work with arbitrary loss functions, and extends target projection to training recurrent neural networks (RNNs) in addition to feedforward networks. tpSGD uses layer-wise stochastic gradient descent (SGD) and local targets generated via random projections of the labels to train the network layer-by-layer using only forward passes. tpSGD does not require retaining gradients during optimization, greatly reducing memory allocation compared to SGD backpropagation (BP) methods, which require multiple instances of the entire neural network's weights, inputs/outputs, and intermediate results. Our method performs comparably to BP gradient descent, within 5% accuracy, on relatively shallow networks of fully-connected, convolutional, and recurrent layers. tpSGD also outperforms other state-of-the-art gradient-free algorithms on shallow models consisting of multi-layer perceptrons, convolutional neural networks (CNNs), and RNNs, achieving competitive accuracy with less memory and time. We evaluate the performance of tpSGD in training deep neural networks (e.g., VGG) and extend the approach to multi-layer RNNs. These experiments highlight new research directions related to optimized layer-based adapter training for domain shift using tpSGD at the edge.
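The core mechanism above — layer-wise SGD against local targets formed by random projections of the labels, with only forward passes — can be sketched roughly as follows. This is a minimal illustration under our own simplifying assumptions (tanh layers, MSE-style local loss), not the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def tpsgd_train(X, y_onehot, layer_sizes, lr=0.1, epochs=10):
    """Backprop-free, layer-wise training: each layer regresses onto a
    fixed random projection of the labels (its 'local target'), so no
    gradients are propagated between layers or retained across layers."""
    h = X
    weights = []
    for d_out in layer_sizes:
        d_in = h.shape[1]
        W = rng.normal(0.0, 1.0 / np.sqrt(d_in), (d_in, d_out))
        # random projection of the labels defines this layer's local target
        B = rng.normal(0.0, 1.0 / np.sqrt(y_onehot.shape[1]),
                       (y_onehot.shape[1], d_out))
        target = y_onehot @ B
        for _ in range(epochs):
            out = np.tanh(h @ W)                      # forward pass only
            err = out - target
            # local gradient for this layer alone (no chain through layers)
            W -= lr * h.T @ (err * (1.0 - out**2)) / len(h)
        weights.append(W)
        h = np.tanh(h @ W)  # freeze the layer; feed activations forward
    return weights, h
```

Because each layer's update uses only its own input activations and local target, memory for intermediate gradients never needs to be retained, which is the source of the memory savings the abstract describes.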
One approach to meeting the challenges of deep lifelong reinforcement learning (LRL) is careful management of the agent's learning experiences, in order to learn (without forgetting) and to build internal meta-models (of the tasks, environments, agents, and world). Generative replay (GR) is a biologically-inspired replay mechanism that augments learning experiences with self-labelled examples drawn from an internal generative model that is updated over time. In this paper, we present a version of GR that satisfies two desiderata: (a) introspective density modelling of the latent representations of policies learned using deep RL, and (b) model-free end-to-end learning. In this work, we study three deep learning architectures for model-free GR. We evaluate our proposed algorithms on three different scenarios comprising tasks from the StarCraft2 and Minigrid domains. We report several key findings showing the impact of design choices on quantitative metrics, including transfer learning, generalization to unseen tasks, fast adaptation after task change, performance comparable to a task expert, and minimization of catastrophic forgetting. We observe that our GR prevents drift in the feature mapping from the latent vector space of a deep actor-critic agent. We also show improvements on established lifelong-learning metrics. We find that introducing a small random replay buffer, used in conjunction with the replay buffer and the generative replay buffer, is needed to significantly increase the stability of training. Overall, we find that "hidden replay" (a well-known architecture for class-incremental classification) is the most promising approach, pushing the state-of-the-art in GR for LRL.
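The replay-mixing idea above — combining current-task experiences, self-labelled samples from a generative model, and a small random replay buffer for stability — can be sketched as a batch-construction routine. The proportions and the `generator.sample()` interface are our own illustrative assumptions:

```python
import random

def build_replay_batch(new_experiences, generator, random_buffer,
                       batch_size=64, gen_frac=0.4, rand_frac=0.1):
    """Mix current-task data with self-labelled generated samples and a
    small random replay buffer (which the paper finds is needed to
    stabilize training)."""
    n_gen = int(batch_size * gen_frac)
    n_rand = min(int(batch_size * rand_frac), len(random_buffer))
    n_new = batch_size - n_gen - n_rand
    batch = random.sample(new_experiences, min(n_new, len(new_experiences)))
    # self-labelled examples drawn from the internal generative model
    batch += [generator.sample() for _ in range(n_gen)]
    # small random buffer of raw past experiences, for stability
    batch += random.sample(random_buffer, n_rand)
    random.shuffle(batch)
    return batch
```

The agent would then perform its usual policy/critic update on the mixed batch, so old behaviours are rehearsed alongside new experience without storing the full history.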
In this paper, we present Hyperdimensional Reconfigurable Analytics at the Tactical Edge (HyDRATE), which uses low-SWaP embedded hardware to enable real-time reconfiguration at the edge by combining non-MAC (free of floating-point multiply-accumulate operations) deep neural networks (DNNs) with hyperdimensional (HD) computing accelerators. We describe the algorithm, the generation of trained quantized models, and the simulated performance of a multiply-accumulate-free feature extractor feeding a hyperdimensional logic-based classifier. We then show how performance increases with the number of hypervectors. We describe the realized low-SWaP FPGA hardware and embedded software system, compare it to traditional DNNs, and detail the implemented hardware accelerators. We discuss the measured system latency and power, the noise robustness obtained through learnable quantization and HD computing, actual versus simulated system performance on a video activity classification task, and a demonstration of reconfiguration on the same dataset. We show that reconfigurability in the field is achieved by retraining only the feedforward HD classifier, without gradient-descent backpropagation (gradient-free), using few-shot learning of new classes at the edge. Initial work used an LRCN DNN and has since been extended to a two-stream DNN with improved performance.
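The gradient-free HD classifier described above can be sketched in a few lines: encode feature vectors into high-dimensional bipolar hypervectors, bundle (sum) them into per-class prototypes, and classify by similarity. Adding a new class few-shot is just bundling a few more examples, with no backpropagation. This is a generic HD-computing illustration under our own assumptions, not the paper's hardware implementation:

```python
import numpy as np

D = 10_000  # hypervector dimensionality

def encode(features, projection):
    """Encode a feature vector into a bipolar hypervector via a fixed
    random projection followed by sign quantization (MAC-free at inference
    when implemented with binary/logic operations in hardware)."""
    return np.sign(features @ projection)

def train_hd(X, y, n_classes, projection):
    """Gradient-free training: bundle (sum) encoded examples per class."""
    protos = np.zeros((n_classes, D))
    for x, label in zip(X, y):
        protos[label] += encode(x, projection)
    return np.sign(protos)

def classify(x, protos, projection):
    """Predict the class whose prototype is most similar to the encoding."""
    hv = encode(x, projection)
    return int(np.argmax(protos @ hv))
```

Because training is a sum of encodings, a new class can be added in the field from a handful of examples by appending one more prototype row.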
We examine how the saccade mechanism from biological vision can be used to make deep neural networks more efficient for classification and object detection problems. Our proposed approach is based on the ideas of attention-driven visual processing and saccades, miniature eye movements influenced by attention. We conduct experiments analyzing: i) the robustness of different deep neural network (DNN) feature extractors to partially-sensed images for image classification and object detection, and ii) the utility of saccades in masking image patches for image classification and object tracking. Experiments with convolutional nets (ResNet-18) and transformer-based models (ViT, DETR, TransTrack) are conducted on several datasets (CIFAR-10, DAVSOD, MSCOCO, and MOT17). Our experiments show intelligent data reduction via learning to mimic human saccades when used with state-of-the-art DNNs for classification, detection, and tracking tasks. We observe minimal drops in performance for classification and detection tasks while using only about 30% of the original sensor data. We discuss how the saccade mechanism can inform hardware design via "in-pixel" processing.
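The patch-masking idea — keeping only the most attended ~30% of the image and discarding the rest — can be sketched as follows. Here per-patch variance stands in for a learned attention/saliency map, which is our own simplifying assumption:

```python
import numpy as np

def saccade_mask(image, patch=8, keep_frac=0.3, saliency=None):
    """Zero out all but the most 'attended' patches (~keep_frac of the
    patch grid). `saliency` is a per-patch score map; by default we use
    patch variance as a crude stand-in for a learned attention map."""
    H, W = image.shape[:2]
    gh, gw = H // patch, W // patch
    patches = image[:gh * patch, :gw * patch].reshape(
        gh, patch, gw, patch, -1)
    if saliency is None:
        saliency = patches.var(axis=(1, 3, 4))        # shape (gh, gw)
    k = max(1, int(keep_frac * gh * gw))              # patches to keep
    thresh = np.partition(saliency.ravel(), -k)[-k]
    mask = (saliency >= thresh)[:, None, :, None, None]
    return (patches * mask).reshape(gh * patch, gw * patch, -1)
```

A downstream classifier or detector then only processes the retained patches, which is the source of the sensor-data reduction the abstract reports.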
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
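The simpler of the two attacks, NAIVEATTACK, adds triggers to the raw data before distillation begins. A minimal sketch of that poisoning step is below; the patch placement, size, and relabeling scheme are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def naive_attack(images, labels, target_label, poison_frac=0.1,
                 trigger_size=3, trigger_value=1.0, seed=0):
    """NAIVEATTACK-style sketch: stamp a small trigger patch into a
    fraction of the raw training images *before* distillation, and
    relabel them to the attacker's target class. The distillation
    procedure then bakes the trigger into the synthetic dataset."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_frac * len(images))
    idx = rng.choice(len(images), n_poison, replace=False)
    # bottom-right corner patch as the backdoor trigger
    images[idx, -trigger_size:, -trigger_size:] = trigger_value
    labels[idx] = target_label
    return images, labels
```

DOORPING differs in that the trigger itself is optimized and re-injected at every distillation iteration rather than fixed up front, which is why it reaches higher attack success rates.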
We present a dynamic path planning algorithm to navigate an amphibious rotorcraft through a concave, time-invariant obstacle field while attempting to minimize energy usage. We create a nonlinear quaternion state model that represents the rotorcraft dynamics above and below the water. The six-degree-of-freedom dynamics are used within a layered architecture to generate motion paths for the vehicle to follow, together with the required control inputs. The rotorcraft has a 3-dimensional map of its surroundings that is updated via limited-range onboard sensor readings within the current medium (air or water). Path planning is done via PRM and D* Lite.
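As a rough illustration of the PRM stage of such a planner (D* Lite, which handles replanning as the map updates, is omitted): sample collision-free configurations, connect near neighbours whose connecting segments are free, and search the resulting roadmap. This is a generic textbook PRM in 2D under our own assumptions, not the paper's planner:

```python
import heapq
import numpy as np

def prm_plan(start, goal, is_free, n_samples=200, k=8, seed=0):
    """Minimal PRM sketch: sample free points in the unit box, connect
    each node to its k nearest neighbours when the straight segment is
    collision-free (is_free(p, q) is an assumed checker), then run
    Dijkstra from start (node 0) to goal (node 1)."""
    rng = np.random.default_rng(seed)
    pts = [np.asarray(start, float), np.asarray(goal, float)]
    while len(pts) < n_samples:
        p = rng.uniform(0.0, 1.0, len(start))
        if is_free(p, p):                      # point-level check
            pts.append(p)
    pts = np.array(pts)
    edges = {i: [] for i in range(len(pts))}
    for i in range(len(pts)):
        d = np.linalg.norm(pts - pts[i], axis=1)
        for j in np.argsort(d)[1:k + 1]:       # k nearest neighbours
            if is_free(pts[i], pts[j]):
                edges[i].append((int(j), d[j]))
                edges[int(j)].append((i, d[j]))
    dist, prev, pq = {0: 0.0}, {}, [(0.0, 0)]
    while pq:
        c, u = heapq.heappop(pq)
        if u == 1:
            break
        if c > dist.get(u, float("inf")):
            continue
        for v, w in edges[u]:
            if c + w < dist.get(v, float("inf")):
                dist[v], prev[v] = c + w, u
                heapq.heappush(pq, (c + w, v))
    if 1 not in prev:
        return None                            # goal unreachable
    path, node = [1], 1
    while node != 0:
        node = prev[node]
        path.append(node)
    return pts[path[::-1]]
```

In the paper's setting the roadmap would live in the vehicle's 3-D map and edge costs would reflect energy usage in the current medium rather than Euclidean distance.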
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually-degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack, with higher level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervision to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.
We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. Muse is trained on a masked modeling task in discrete token space: given the text embedding extracted from a pre-trained large language model (LLM), Muse is trained to predict randomly masked image tokens. Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding. The use of a pre-trained LLM enables fine-grained language understanding, translating to high-fidelity image generation and the understanding of visual concepts such as objects, their spatial relationships, pose, cardinality etc. Our 900M parameter model achieves a new SOTA on CC3M, with an FID score of 6.06. The Muse 3B parameter model achieves an FID of 7.88 on zero-shot COCO evaluation, along with a CLIP score of 0.32. Muse also directly enables a number of image editing applications without the need to fine-tune or invert the model: inpainting, outpainting, and mask-free editing. More results are available at https://muse-model.github.io
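The parallel decoding that makes Muse faster than autoregressive models can be sketched generically: start from an all-masked token grid, predict every masked position at once, commit only the most confident predictions, and repeat for a fixed number of steps. The `predict_fn` interface and the linear unmasking schedule below are our own illustrative assumptions:

```python
import numpy as np

MASK = -1

def parallel_decode(predict_fn, n_tokens, n_steps=8):
    """Iterative parallel decoding sketch for masked-token generation.
    predict_fn(tokens) -> (pred_tokens, confidences) is a hypothetical
    model API; each step commits the most confident masked predictions."""
    tokens = np.full(n_tokens, MASK)
    for step in range(n_steps):
        masked = tokens == MASK
        if not masked.any():
            break
        preds, conf = predict_fn(tokens)       # one parallel forward pass
        # schedule: by step s, roughly (s+1)/n_steps of tokens are committed
        n_commit = int(np.ceil(n_tokens * (step + 1) / n_steps)) \
            - int((~masked).sum())
        # rank masked positions by confidence; unmasked ones sort last
        order = np.argsort(-np.where(masked, conf, -np.inf))
        chosen = order[:max(n_commit, 1)]
        tokens[chosen] = preds[chosen]
    return tokens
```

Decoding all positions in a handful of parallel passes, instead of one token per pass, is the efficiency advantage the abstract claims over autoregressive models such as Parti.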
The visual dimension of cities has been a fundamental subject in urban studies, since the pioneering work of scholars such as Sitte, Lynch, Arnheim, and Jacobs. Several decades later, big data and artificial intelligence (AI) are revolutionizing how people move, sense, and interact with cities. This paper reviews the literature on the appearance and function of cities to illustrate how visual information has been used to understand them. A conceptual framework, Urban Visual Intelligence, is introduced to systematically elaborate on how new image data sources and AI techniques are reshaping the way researchers perceive and measure cities, enabling the study of the physical environment and its interactions with socioeconomic environments at various scales. The paper argues that these new approaches enable researchers to revisit the classic urban theories and themes, and potentially help cities create environments that are more in line with human behaviors and aspirations in the digital age.